Abstract

In this paper we examine the importance of the choice of metric in path coupling, and
the relationship of this to stopping time analysis. We give strong evidence that stopping time
analysis is no more powerful than standard path coupling. In particular, we prove a stronger
theorem for path coupling with stopping times, using a metric which allows us to restrict analysis
to standard one-step path coupling. This approach provides insight for the design of non-standard
metrics giving improvements in the analysis of specific problems.

We give illustrative applications to hypergraph independent sets and SAT instances, hypergraph
colourings and colourings of bipartite graphs. In particular we prove rapid mixing for
Glauber dynamics on independent sets in hypergraphs whenever the minimum edge size $m$ and
degree $\Delta$ satisfy $m \ge \Delta + 2$, and for all edge sizes when $\Delta = 3$. Previously rapid mixing was
only known for $m \ge 2\Delta + 1$. This result leads to approximation schemes for monotone SAT
formulae in which the maximum number of occurrences of a variable ($\Delta$) and the minimum
number of variables per clause ($m$) satisfy the same condition. For Glauber dynamics on proper
colourings of 3-uniform hypergraphs we prove rapid mixing whenever the number of colours $q$
is at least $\lceil \frac{3}{2}\Delta + 1 \rceil$. Previously the best known result was for $q > 1.65\Delta$ and $\Delta > \Delta_0$ for some
large $\Delta_0$. Finally we prove rapid mixing of scan dynamics (where the order of vertex updates is
deterministic) for proper colourings of bipartite graphs whenever $q > f(\Delta)$, where $f(\Delta) \to \beta\Delta$
as $\Delta \to \infty$, and $\beta$ satisfies $\frac{1}{\beta} e^{1/\beta} = 1$ ($\beta \approx 1.76$). This gives rapid mixing with fewer colours
than Vigoda's $11\Delta/6$ bound [22], whenever $\Delta > 31$, and equals this bound for $\Delta > 14$.
1 Introduction



Path coupling [H] has proved to be a useful technique for analysing Markov chains. Analysis is 
carried out relative to a chosen metric on the state space, for example the Hamming distance on 
the independent sets in a graph or hypergraph. The limitations of the analysis are always caused 
by certain "bad" configurations. But these configurations may be unlikely in a typical realisation of 
the chain. Consequently, path coupling has been augmented by other techniques, such as stopping 
time analysis. See [2, 10, 15, 18] for some applications of this technique. A general theorem for
applying stopping times was proved in and improved somewhat in [2].

The stopping time approach is applicable when the bad configurations have a reasonable prob- 
ability of becoming less bad as time passes. For example, the bad configurations for the Glauber 
dynamics on hypergraph independent sets involve almost full edges containing the change vertex. 
(See [2] for details.) However, it seems likely that the number of occupied vertices in these edges 
will have been reduced before we must either increase or decrease the distance between the coupled 
chains. This observation allows a greatly improved analysis [2].

The stopping time approach is a multistep analysis, and appears to give a powerful extension of 
path coupling. However, in this paper we provide strong evidence that the stopping time approach 
is no more powerful than single-step path coupling. We observe that, in cases where stopping times 
can be employed to advantage, equally good or better results can be achieved by using a suitably 
tailored metric in the one-step analysis. The intuition behind the choice of metric will be illustrated 
with several examples. 

In fact, our first example is a proof of a theorem for path coupling using stopping times, relying 
on a particular choice of metric which enables us to work with the standard one-step path coupling. 
The resulting theorem is stronger than those in [2J El ■ The proof implies that all results obtained 
using stopping times can just as well be obtained using standard path coupling and the right choice 
of metric. This does not immediately imply that we can abandon the analysis of stopping times. 
Determining the metric used in our proof involves bounding the expected distance at a stopping 
time. However the proof does suggest that it may be better to carry out one-step analysis using a 
metric indicated directly by the stopping time intuition. 

With this insight, we revisit the Glauber dynamics for hypergraph independent sets (or equiv- 
alently, satisfying assignments of monotone SAT formulas), and hypergraph colourings, analysed 
in [2] using stopping times. We find that we are able to obtain considerably stronger results than
those obtained in [2], using metrics inspired by the stopping times considerations but then optimised
to give the best results. The technical advantage arises from the possibility of using linearity
of expectation where stopping time analysis must use concentration inequalities and union bounds.

We note that this paper does not contain the first uses of "clever" metrics with path coupling. 
See El for examples. But we do give the first widely applicable rationale for choosing a good
metric. While there have been instances in the literature of optimising the chain [12, 22], the only
previous analysis of which we are aware which uses optimisation of the metric appeared in [17].

The organisation of the paper is as follows. In section 2 we prove a better stopping time
theorem than previously known, using only standard path coupling. In section 3 we give our
improved results for sampling independent sets in hypergraphs, and in section 4 applications to
counting the number of satisfying assignments in monotone SAT formulas. In section 5 we give
improved results for sampling colourings of 3-uniform hypergraphs. Finally, in section 6 we give a
completely new application, to the "scan" chain for sampling colourings of bipartite graphs. For
even relatively small values of $\Delta$, our results improve Vigoda's [22] celebrated $11\Delta/6$ bound on the
number of colours required for rapid mixing.
2 Path coupling and stopping times

We first deal with the most useful and applicable case, in which the stopping time for a pair of 
coupled chains is the first time that the distance between the two chains changes. This simplifies 
the proofs and makes the thrust of the argument clearer. In Section 2.2 we do deal with more
general stopping times. However, it should be noted that so far all applications of stopping time
results in path coupling have only used this simple form of stopping time.

2.1 Distance-change stopping time 

Let $\mathcal{M}$ be a Markov chain on state space $\Omega$. Let $d$ be an integer valued metric on $\Omega \times \Omega$, and let
$(X_t, Y_t)$ be a path coupling for $\mathcal{M}$. We define $T_t$, a stopping time for the pair $(X_t, Y_t) \in S$, to be
the smallest $t' > t$ such that $d(X_{t'}, Y_{t'}) \ne d(X_t, Y_t)$. We will define a new metric $d'$ such that if we
have contraction in the metric $d$ at the stopping times, then we have contraction in the metric $d'$
at every step which has a positive probability of being a stopping time.

Let $\alpha > 0$ be a constant such that $\mathrm{E}[d(X_{T_t}, Y_{T_t})] \le \alpha\, d(X_t, Y_t)$ for all $(X_t, Y_t)$ in $S$. If $\alpha < 1$,
then for any $(X_t, Y_t) \in S$, we simply define $d'$ as follows.

\[ d'(X_t, Y_t) = (1-\alpha)\, d(X_t, Y_t) + \mathrm{E}[d(X_{T_t}, Y_{T_t})] \le d(X_t, Y_t). \tag{1} \]

The metric is extended in the usual way to pairs $(X_t, Y_t) \notin S$, using shortest paths. See, for
example, We will apply path coupling with the metric d' and the original coupling. First we 
show a contraction property for this metric. 

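Lemma 2.1. If $\mathrm{E}[d(X_{T_t}, Y_{T_t})] \le \alpha\, d(X_t, Y_t) < d(X_t, Y_t)$ for all $(X_t, Y_t)$ in $S$, then

\[ \mathrm{E}[d'(X_k, Y_k) \mid X_0, Y_0] \le \bigl(1 - (1-\alpha)\Pr(T_0 \le k)\bigr)\, d'(X_0, Y_0). \]

Proof. We prove this by induction on $k$. It obviously holds for $k = 0$, since $T_0 > 0$. Using $\mathbf{1}_A$ to
denote the 0/1 indicator of any event $A$, we may write (1) as

\[ d'(X_0, Y_0) = (1-\alpha)\, d(X_0, Y_0) + \mathrm{E}[d(X_{T_k}, Y_{T_k})\,\mathbf{1}_{T_0 > k}] + \mathrm{E}[d(X_{T_0}, Y_{T_0})\,\mathbf{1}_{T_0 \le k}], \tag{2} \]

since if $T_0 > k$ then $T_k = T_0$. Similarly, we have

\begin{align*}
\mathrm{E}[d'(X_k, Y_k)] &= \mathrm{E}[d'(X_k, Y_k)\,\mathbf{1}_{T_0 > k}] + \mathrm{E}[d'(X_k, Y_k)\,\mathbf{1}_{T_0 \le k}] \\
&= (1-\alpha)\,\mathrm{E}[d(X_k, Y_k)\,\mathbf{1}_{T_0 > k}] + \mathrm{E}[d(X_{T_k}, Y_{T_k})\,\mathbf{1}_{T_0 > k}] + \mathrm{E}[d'(X_k, Y_k)\,\mathbf{1}_{T_0 \le k}] \\
&= (1-\alpha)\,\mathrm{E}[d(X_0, Y_0)\,\mathbf{1}_{T_0 > k}] + \mathrm{E}[d(X_{T_k}, Y_{T_k})\,\mathbf{1}_{T_0 > k}] + \mathrm{E}[d'(X_k, Y_k)\,\mathbf{1}_{T_0 \le k}]. \tag{3}
\end{align*}

Subtracting (2) from (3), we have

\[ \mathrm{E}[d'(X_k, Y_k)] - d'(X_0, Y_0) = -(1-\alpha)\,\mathrm{E}[d(X_0, Y_0)\,\mathbf{1}_{T_0 \le k}] + \mathrm{E}[(d'(X_k, Y_k) - d(X_{T_0}, Y_{T_0}))\,\mathbf{1}_{T_0 \le k}]. \]

For $T_0 \le k$, since $k - T_0 \le k - 1$ the inductive hypothesis implies $\mathrm{E}[d'(X_k, Y_k) \mid X_{T_0}, Y_{T_0}] \le
d'(X_{T_0}, Y_{T_0}) \le d(X_{T_0}, Y_{T_0})$ (if $(X_k, Y_k) \notin S$ this follows by linearity). Hence we have

\[ \mathrm{E}[d'(X_k, Y_k)] - d'(X_0, Y_0) \le -(1-\alpha)\,\mathrm{E}[d(X_0, Y_0)\,\mathbf{1}_{T_0 \le k}]. \]

The conclusion follows, since $\mathrm{E}[d(X_0, Y_0)\,\mathbf{1}_{T_0 \le k}] = \Pr(T_0 \le k)\, d(X_0, Y_0) \ge \Pr(T_0 \le k)\, d'(X_0, Y_0)$. □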
We may now prove the first version of our main result. 
Theorem 2.2. Let $\mathcal{M}$ be a Markov chain on state space $\Omega$. Let $d$ be an integer valued metric on
$\Omega$, and let $(X_t, Y_t)$ be a path coupling for $\mathcal{M}$. Let $T_t$ be the above stopping times. Suppose for all
$(X_0, Y_0) \in S$ and for some integer $k$ and $p > 0$, that

(i) $\Pr[T_0 \le k] \ge p$,

(ii) $\mathrm{E}[d(X_{T_0}, Y_{T_0})/d(X_0, Y_0)] \le \alpha < 1$.

Then the mixing time $\tau(\epsilon)$ of $\mathcal{M}$ satisfies

\[ \tau(\epsilon) \le \frac{k}{p(1-\alpha)} \ln\Bigl(\frac{eD}{\epsilon(1-\alpha)}\Bigr), \]

where $D = \max\{d(X, Y) : X, Y \in \Omega\}$.

Proof. From Lemma 2.1, $d'$ contracts by a factor $1 - (1-\alpha)p \le e^{-(1-\alpha)p}$ for every $k$ steps of $\mathcal{M}$.
Note also that $d' \le D$. It follows that, at time $\tau(\epsilon)$, we have

\[ \Pr(X_\tau \ne Y_\tau) \le \mathrm{E}[d(X_\tau, Y_\tau)] \le \frac{\mathrm{E}[d'(X_\tau, Y_\tau)]}{1-\alpha} \le \frac{D e^{-(1-\alpha)p\tau/k}}{1-\alpha} \le \epsilon, \]

from which the theorem follows. □
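The bound of Theorem 2.2 is easy to evaluate numerically. The following is a minimal sketch, not part of the original analysis: the function names `theorem_2_2_bound` and `steps_needed` are hypothetical, and `steps_needed` simply iterates the per-$k$-steps contraction factor $1-(1-\alpha)p$ from the proof, starting from $D/(1-\alpha)$, until it drops below $\epsilon$.

```python
import math


def theorem_2_2_bound(k, p, alpha, D, eps):
    """Evaluate k / (p (1 - alpha)) * ln(e D / (eps (1 - alpha)))."""
    if not (k >= 1 and 0 < p <= 1 and 0 < alpha < 1 and D >= 1 and 0 < eps < 1):
        raise ValueError("parameters outside the range assumed by the theorem")
    return k / (p * (1 - alpha)) * math.log(math.e * D / (eps * (1 - alpha)))


def steps_needed(k, p, alpha, D, eps):
    """Smallest multiple of k after which D (1 - (1 - alpha) p)^rounds / (1 - alpha) <= eps."""
    rounds = 0
    value = D / (1 - alpha)
    while value > eps:
        value *= 1 - (1 - alpha) * p   # one block of k steps contracts d' by this factor
        rounds += 1
    return rounds * k


if __name__ == "__main__":
    args = (1, 0.5, 0.5, 100, 0.01)    # illustrative values only
    print(steps_needed(*args), "<=", theorem_2_2_bound(*args))
```

Under these assumptions the iterated contraction always reaches $\epsilon$ within the number of steps given by the closed-form bound, which is what the rounding by the factor $e$ inside the logarithm pays for.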

If $1 - \alpha$ is small compared to $\epsilon$, it is possible to do better than this. We will need the technical
Lemma 2.3 below, which says that we will not have to wait too long for a stopping time to occur.



Lemma 2.3. If $\mathcal{M}$ satisfies the conditions of Theorem 2.2 then $\Pr[T_t > t + t'] \le (1-p)^{\lfloor t'/k \rfloor}$.



Proof. We prove this by induction on $t'$. It clearly holds for all $t$ and $t' < k$ since $\lfloor t'/k \rfloor = 0$.
Suppose inductively that $\Pr[T_t > s + t] \le (1-p)^{\lfloor s/k \rfloor}$ for all $t$ and $s < t'$. Then, if $t' \ge k$,

\begin{align*}
\Pr[T_t > t + t'] &= \Pr[T_t > t + t' - k \text{ and } T_{t+t'-k} > t + t'] \\
&= \Pr[T_t > t + t' - k]\, \Pr[T_{t+t'-k} > t + t' \mid T_t > t + t' - k].
\end{align*}

Since the process is Markovian, and by condition (i),

\begin{align*}
\Pr[T_{t+t'-k} > t + t' \mid T_t > t + t' - k] &\le \max\{\Pr[T_{t+t'-k} > t + t'] : (X_{t+t'-k}, Y_{t+t'-k}) \in S\} \\
&= \max\{\Pr[T_0 > k] : (X_0, Y_0) \in S\} \\
&\le 1 - p.
\end{align*}

By the inductive hypothesis this gives

\[ \Pr[T_t > t' + t] \le (1-p)^{\lfloor (t'-k)/k \rfloor}(1-p) = (1-p)^{\lfloor t'/k \rfloor}. \qquad □ \]

Theorem 2.4. Let $\mathcal{M}$ be a Markov chain on state space $\Omega$. Let $d$ be an integer valued metric on
$\Omega \times \Omega$, and let $(X_t, Y_t)$ be a path coupling for $\mathcal{M}$. Let $T_t$ be the above stopping time. Suppose for
all $(X_0, Y_0) \in S$ and for some integer $k$ and $p > 0$, that

(i) $\Pr[T_0 \le k] \ge p$,

(ii) $\mathrm{E}[d(X_{T_0}, Y_{T_0})/d(X_0, Y_0)] \le \alpha < 1$.

Then the mixing time $\tau(\epsilon)$ of $\mathcal{M}$ satisfies

\[ \tau(\epsilon) \le \frac{k(2-\alpha)}{p(1-\alpha)} \ln\Bigl(\frac{2eD}{\epsilon}\Bigr), \]

where $D = \max\{d(X, Y) : X, Y \in \Omega\}$.

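Proof. Let $X_t = Z_0^0, Z_0^1, \ldots, Z_0^r = Y_t$ be a shortest path from $X_t$ to $Y_t$ in the metric $d'$, such that
$(Z_0^i, Z_0^{i+1}) \in S$ $(i = 0, \ldots, r-1)$. If $t_i$ is the stopping time for $(Z_0^i, Z_0^{i+1})$ then, using Lemma 2.3,

\begin{align*}
\Pr(X_{t+t'} \ne Y_{t+t'} \mid X_t, Y_t) &\le \Pr(\exists i : Z_{t'}^i \ne Z_{t'}^{i+1}) \\
&\le \Pr(\exists i : Z_{t_i}^i \ne Z_{t_i}^{i+1} \text{ or } t_i > t') \\
&\le \sum_{i=0}^{r-1} \bigl( \mathrm{E}[d(Z_{t_i}^i, Z_{t_i}^{i+1})] + \Pr(t_i > t') \bigr) \\
&\le \sum_{i=0}^{r-1} \bigl( d'(Z_0^i, Z_0^{i+1}) + (1-p)^{\lfloor t'/k \rfloor} \bigr) \\
&\le d'(X_t, Y_t) + D(1-p)^{\lfloor t'/k \rfloor}.
\end{align*}

Hence

\begin{align*}
\Pr(X_{t+t'} \ne Y_{t+t'}) &\le \mathrm{E}[d'(X_t, Y_t)] + D(1-p)^{\lfloor t'/k \rfloor} \\
&\le D e^{-(1-\alpha)p\lfloor t/k \rfloor} + D(1-p)^{\lfloor t'/k \rfloor} \\
&\le D\bigl( e^{-(1-\alpha)p\lfloor t/k \rfloor} + e^{-p\lfloor t'/k \rfloor} \bigr).
\end{align*}

Therefore

\[ \Pr(X_{t+t'} \ne Y_{t+t'}) \le \tfrac{1}{2}\epsilon + \tfrac{1}{2}\epsilon = \epsilon, \quad \text{if } t \ge k\Bigl\lceil \frac{\ln(2D/\epsilon)}{p(1-\alpha)} \Bigr\rceil \text{ and } t' \ge k\Bigl\lceil \frac{\ln(2D/\epsilon)}{p} \Bigr\rceil. \]

The statement of the theorem now follows easily. □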
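The last step of the proof can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes the two time budgets are rounded up to multiples of $k$ (ceilings), which is one natural reading of the conditions on $t$ and $t'$. It computes the two budgets, evaluates the failure bound $D(e^{-(1-\alpha)p\lfloor t/k\rfloor} + e^{-p\lfloor t'/k\rfloor})$, and compares $t + t'$ with the closed form of Theorem 2.4.

```python
import math


def split_times(k, p, alpha, D, eps):
    """Return (t, t'): the contraction phase and the waiting phase, as multiples of k."""
    log_term = math.log(2 * D / eps)
    t = k * math.ceil(log_term / (p * (1 - alpha)))
    t_prime = k * math.ceil(log_term / p)
    return t, t_prime


def failure_bound(k, p, alpha, D, t, t_prime):
    """D * (exp(-(1 - alpha) p floor(t/k)) + exp(-p floor(t'/k)))."""
    return D * (math.exp(-(1 - alpha) * p * (t // k)) + math.exp(-p * (t_prime // k)))


def theorem_2_4_bound(k, p, alpha, D, eps):
    """Evaluate k (2 - alpha) / (p (1 - alpha)) * ln(2 e D / eps)."""
    return k * (2 - alpha) / (p * (1 - alpha)) * math.log(2 * math.e * D / eps)


if __name__ == "__main__":
    k, p, alpha, D, eps = 3, 0.2, 0.9, 50, 0.05   # illustrative values only
    t, t_prime = split_times(k, p, alpha, D, eps)
    print(t, t_prime, failure_bound(k, p, alpha, D, t, t_prime))
    print(t + t_prime, "<=", theorem_2_4_bound(k, p, alpha, D, eps))
```

The two rounding losses of at most $k$ each are absorbed by the extra factor $e$ inside the logarithm, since $(2-\alpha)/(p(1-\alpha)) \ge 2$.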
2.2 General stopping times 

We now extend the results proved in this section to incorporate stopping times other than the first 
time at which the distance changes. In order to make sense in the context of path coupling, the 
stopping times must satisfy the following conditions. 

Stopping time conditions: 

1. There must be a stopping time $T_0(X_0, Y_0)$ defined for each pair $(X_0, Y_0) \in S$, such
that $\mathrm{E}[d(X_{T_0(X_0,Y_0)}, Y_{T_0(X_0,Y_0)})] \le \alpha\, d(X_0, Y_0)$.

2. For all $(X_0, Y_0) \in S$ we have $\Pr[T_0(X_0, Y_0) \le k] \ge p$.

3. The coupling should be Markovian.

We may assume that for $(X_0, Y_0) \in S$ if $X_t = Y_t$ then $T_0(X_0, Y_0) \le t$. Since the future evolution
of $(X_t, Y_t)$ does not depend on the evolution up to time $t$, by 1 and 3 it follows that for all $t \ge 0$ there
is a stopping time $T_t(X, Y)$ such that if $X_t = X$, $Y_t = Y$ then $\mathrm{E}[d(X_{T_t(X,Y)}, Y_{T_t(X,Y)})] \le \alpha\, d(X_t, Y_t)$.
Moreover, from 2 and 3 it follows that $\Pr[T_t(X, Y) \le k + t] \ge p$.
When dealing with the first change in distance we had the benefit that for all $(X_t, Y_t) \in S$
and $t' > t$, if $T_t(X_t, Y_t) > t'$ then $(X_{t'}, Y_{t'}) \in S$ and also $T_{t'}(X_{t'}, Y_{t'}) = T_t(X_t, Y_t)$. This no longer
necessarily holds. We must therefore be more careful about exactly which stopping time we are
referring to at any time and regarding any pair of states.

Let $(X_t, Y_t)$ be a coupled evolution of the chain, and let $P_t = (X_t = Z_t^0, Z_t^1, Z_t^2, \ldots, Z_t^{r_t} = Y_t)$
be the path-coupling path from $X_t$ to $Y_t$, so that $(Z_t^i, Z_t^{i+1}) \in S$ for all $i, t$. We will inductively
define a set of starting pairs in the paths $P_t$, $t \ge 0$, as follows.

1. For all $i$, $(Z_0^i, Z_0^{i+1})$ is a starting pair.

2. For each $(Z_{t_1}^i, Z_{t_1}^{i+1})$, if there is a time $t_0 < t_1$ and starting pair $(Z_{t_0}^j, Z_{t_0}^{j+1}) \in P_{t_0}$ such that
$(Z_{t_1}^i, Z_{t_1}^{i+1})$ is in the subpath of $P_{t_1}$ which evolved from $(Z_{t_0}^j, Z_{t_0}^{j+1})$ and $T_{t_0}(Z_{t_0}^j, Z_{t_0}^{j+1}) > t_1$,
then $(Z_{t_1}^i, Z_{t_1}^{i+1})$ is not a starting pair but $t_0$ is the starting time associated with $(Z_{t_1}^i, Z_{t_1}^{i+1})$
and $(Z_{t_0}^j, Z_{t_0}^{j+1})$ is the starting pair associated with $(Z_{t_1}^i, Z_{t_1}^{i+1})$.

3. For each $(Z_{t_1}^i, Z_{t_1}^{i+1})$ such that there is no time and pair as above, $(Z_{t_1}^i, Z_{t_1}^{i+1})$ is defined
to be a starting pair. Note that in this case there must be a time $t_0 < t_1$ and starting pair
$(Z_{t_0}^j, Z_{t_0}^{j+1}) \in P_{t_0}$ such that $(Z_{t_1}^i, Z_{t_1}^{i+1})$ is in the subpath of $P_{t_1}$ which evolved from $(Z_{t_0}^j, Z_{t_0}^{j+1})$
and $T_{t_0}(Z_{t_0}^j, Z_{t_0}^{j+1}) = t_1$.



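For a starting pair $(Z_{t_0}^j, Z_{t_0}^{j+1})$, we define the distance at time $t_1$, $t_0 \le t_1 < T_{t_0}(Z_{t_0}^j, Z_{t_0}^{j+1})$, to be

\[ d_{t_1}(Z_{t_0}^j, Z_{t_0}^{j+1}) = (1-\alpha)\, d(Z_{t_0}^j, Z_{t_0}^{j+1}) + \mathrm{E}\bigl[ d(Z_{T_{t_0}}^j, Z_{T_{t_0}}^{j+1}) \mid \mathcal{F}_{t_1} \bigr], \tag{4} \]

where $T_{t_0}$ abbreviates $T_{t_0}(Z_{t_0}^j, Z_{t_0}^{j+1})$ and $\mathcal{F}_t$ is the $\sigma$-algebra generated by $\{(X_{t'}, Y_{t'}) : t' \le t\}$. Thus $\{\mathcal{F}_t : t \ge 0\}$ is the filtration
generated by the coupling. The distance at times not in the given range is zero. This is analogous
to the definition of the new metric in equation (1). At a time $t$ we are interested in the set $\mathcal{SP}_t$ of
starting pairs $(Z_{t_0}^j, Z_{t_0}^{j+1})$ for which $T_{t_0}(Z_{t_0}^j, Z_{t_0}^{j+1}) > t$. We define the distance between $X_t$ and $Y_t$
to be

\[ D(X_t, Y_t) = \sum_{(Z_{t_0}^j, Z_{t_0}^{j+1}) \in \mathcal{SP}_t} d_t(Z_{t_0}^j, Z_{t_0}^{j+1}). \tag{5} \]

It is clear that if $d(X_t, Y_t) \ne 0$ then $D(X_t, Y_t) \ge (1 - \alpha)$. We now prove a contraction lemma
analogous to Lemma 2.1.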

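Lemma 2.5. Given the stopping times conditions, then for all $(X_0, Y_0)$ and all $t \ge 0$

\[ \mathrm{E}[D(X_{t+k}, Y_{t+k}) \mid X_t, Y_t] \le \Bigl(1 - \frac{(1-\alpha)p}{1-\alpha+\gamma}\Bigr) D(X_t, Y_t), \]

where $\gamma$ is the maximum value of $\mathrm{E}[d(X_{T_0(X,Y)}, Y_{T_0(X,Y)}) \mid \mathcal{F}_t]/d(X_0, Y_0)$ over all pairs in $S$ and
evolutions $\mathcal{F}_t$ such that $t < T_0(X, Y)$.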

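Proof. The set $\mathcal{SP}_{t+k}$ is the union of the starting pairs from $\mathcal{SP}_t$ which did not reach their stopping
time by time $t + k$, and those starting pairs arising from a pair in $\mathcal{SP}_t$ which did stop by time $t + k$.
Hence, writing $T_{t_0}$ for $T_{t_0}(Z_{t_0}^j, Z_{t_0}^{j+1})$,

\[ D(X_{t+k}, Y_{t+k}) = \sum_{(Z_{t_0}^j, Z_{t_0}^{j+1}) \in \mathcal{SP}_t} \Bigl( \mathbf{1}_{T_{t_0} > t+k}\, d_{t+k}(Z_{t_0}^j, Z_{t_0}^{j+1}) + \mathbf{1}_{T_{t_0} \le t+k} \sum d_{t+k}(Z_{t_1}^i, Z_{t_1}^{i+1}) \Bigr), \]

where the second sum is over starting pairs arising from the stopping of pair $(Z_{t_0}^j, Z_{t_0}^{j+1})$. As in
Lemma 2.1 we may assume inductively that $\mathrm{E}[d_{t+k}(Z_{t_1}^i, Z_{t_1}^{i+1}) \mid t < t_1 \le t + k] \le d(Z_{t_1}^i, Z_{t_1}^{i+1})$.
Then, given $\mathcal{F}_t$, the expected value of $D(X_{t+k}, Y_{t+k})$ is

\begin{align*}
\mathrm{E}[D(X_{t+k}, Y_{t+k})] &\le \sum_{(Z_{t_0}^j, Z_{t_0}^{j+1}) \in \mathcal{SP}_t} \Bigl( \mathrm{E}\bigl[\mathbf{1}_{T_{t_0} > t+k}\, d_{t+k}(Z_{t_0}^j, Z_{t_0}^{j+1})\bigr] + \mathrm{E}\bigl[\mathbf{1}_{T_{t_0} \le t+k}\, d(Z_{T_{t_0}}^j, Z_{T_{t_0}}^{j+1})\bigr] \Bigr) \\
&\le \sum_{(Z_{t_0}^j, Z_{t_0}^{j+1}) \in \mathcal{SP}_t} \Bigl( \mathrm{E}\bigl[\mathbf{1}_{T_{t_0} > t+k}\,(1-\alpha)\, d(Z_{t_0}^j, Z_{t_0}^{j+1})\bigr] + \mathrm{E}\bigl[d(Z_{T_{t_0}}^j, Z_{T_{t_0}}^{j+1})\bigr] \Bigr). \tag{6}
\end{align*}

So subtracting (5) from (6) we get

\begin{align*}
\mathrm{E}[D(X_{t+k}, Y_{t+k})] - D(X_t, Y_t) &\le -\sum_{(Z_{t_0}^j, Z_{t_0}^{j+1}) \in \mathcal{SP}_t} (1-\alpha)\, \mathrm{E}\bigl[d(Z_{t_0}^j, Z_{t_0}^{j+1})\, \mathbf{1}_{T_{t_0} \le t+k}\bigr] \\
&\le -(1-\alpha)p \sum_{(Z_{t_0}^j, Z_{t_0}^{j+1}) \in \mathcal{SP}_t} d(Z_{t_0}^j, Z_{t_0}^{j+1}) \tag{7} \\
&\le -\frac{(1-\alpha)p}{1-\alpha+\gamma} \sum_{(Z_{t_0}^j, Z_{t_0}^{j+1}) \in \mathcal{SP}_t} d_t(Z_{t_0}^j, Z_{t_0}^{j+1}).
\end{align*}

The final inequality follows since, by (4), we have $d_t(Z_{t_0}^j, Z_{t_0}^{j+1}) \le (1 - \alpha + \gamma)\, d(Z_{t_0}^j, Z_{t_0}^{j+1})$. □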



The $\gamma$ term arises because although we have contraction in inequality (7), we need to express
this as a proportion of $D(X_t, Y_t)$. The expected value at the stopping time is only guaranteed to
be at most $\alpha d$ at the outset. If we have already evolved, possibly adversely, the expected value at
the stopping time could be larger than this, and the proportional changes correspondingly smaller.
However $\gamma$ is bounded by the maximum distance (in the original metric) that can occur at the
stopping time; in practice this is very likely to be a small constant.

By following the same arguments as in Section 2.1 with this contraction lemma we obtain the
following theorem.

Theorem 2.6. Let $\mathcal{M}$ be a Markov chain on state space $\Omega$. Let $d$ be an integer valued metric on $\Omega$,
and let $(X_t, Y_t)$ be a path coupling for $\mathcal{M}$. Let $T_0(X_0, Y_0)$ be stopping times satisfying the stopping
times conditions. Then the mixing time $\tau(\epsilon)$ of $\mathcal{M}$ satisfies

\[ \tau(\epsilon) = O\Bigl( \frac{k(1-\alpha+\gamma)}{p(1-\alpha)} \ln\frac{D}{\epsilon} \Bigr). \]



Remark 2.7. One of the most interesting features of Theorems 2.4 and 2.6 is that their proofs
employ only standard path coupling (applied to the $k$-step chain), but with a metric which has
some useful properties. Thus, for any problem to which stopping times might be applied, there
exists a metric from which the same result could be obtained using one-step path coupling.

Remark 2.8. Stopping times condition 2 may appear a restriction, but appears to be naturally
satisfied in most applications, even with $k = 1$. The alternative, though less natural, assumption
of uniformly bounded stopping times JH] is also included. (See Remark 2.9.)

Remark 2.9. We may compare this stopping time theorem with those in The main result 

of ^3 Theorem 3] concerns bounded stopping times, where Tq < M for all (Xq,Yq) £ S, and 
gives a mixing time of $O(M(1-\alpha)^{-1}\log D)$. By setting $k = M$ and $p = 1$ in Theorem 2.4 we
obtain the same mixing time up to minor changes in constants, but with a proof that does not
involve defining a multistep coupling. For unbounded stopping times, [15, Corollary 4] gives a bound
$O(\mathrm{E}[T](1-\alpha)^{-2} W \log D)$ by truncating the stopping times, where $W$ denotes the maximum of
$d(X_t, Y_t)$ over all $(X_0, Y_0) \in S$ and $t \le T$. In most applications $\mathrm{E}[T] \le k/p$, so in Theorem 2.4
we obtain an improvement of order $W(1-\alpha)^{-1}$. By comparison with [2], we obtain a more
modest improvement, of order $\log W \log(D(1-\alpha)^{-1})/\log D$. For the more general stopping times,
comparing Theorem 2.6 and [15, Corollary 4], we obtain an improvement of order $W/((1-\alpha)(1-\alpha+\gamma))$. It should
be noted that $\gamma \le W$.

Remark 2.10. Further improvements to Theorem 2.4 seem unlikely, other than in constants. The
term k/p must be present, since it bounds a single stopping time. A term 1/(1 — a)log(D/e) = 
0(log a (-D/e)) also seems essential, since it bounds the number of stopping times required. Likewise 
improvements to Theorem 2.6 are likely restricted to changing the dependence on $\gamma$, although it
seems plausible that some dependence is required. 

3 Hypergraph independent sets 

We now turn our attention to hypergraph independent sets. These were previously studied in [2].
Let $\mathcal{H} = (V, \mathcal{E})$ be a hypergraph of maximum degree $\Delta$ and minimum edge size $m$. A subset $S \subseteq V$
of the vertices is independent if no edge is a subset of $S$. Let $\Omega(\mathcal{H})$ be the set of all independent
sets of $\mathcal{H}$. We define the Markov chain $\mathcal{M}(\mathcal{H})$ with state space $\Omega(\mathcal{H})$ by the following transition
process (Glauber dynamics). If the state of $\mathcal{M}$ at time $t$ is $X_t$, the state at $t + 1$ is determined by
the following procedure.

1. Select a vertex $v \in V$ uniformly at random,

2. (i) if $v \in X_t$ let $X_{t+1} = X_t \setminus \{v\}$ with probability 1/2,

(ii) if $v \notin X_t$ and $X_t \cup \{v\}$ is independent, let $X_{t+1} = X_t \cup \{v\}$ with probability 1/2,

(iii) otherwise let $X_{t+1} = X_t$.
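The transition procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: states are `frozenset`s of vertices, hyperedges are `frozenset`s, and the names `is_independent` and `glauber_step` are hypothetical helpers rather than anything defined in the paper.

```python
import random


def is_independent(vertex_set, edges):
    """A set is independent if no hyperedge is entirely contained in it."""
    return not any(edge <= vertex_set for edge in edges)


def glauber_step(state, vertices, edges, rng):
    """One transition of the chain; returns the next independent set."""
    v = rng.choice(vertices)                  # step 1: uniform random vertex
    if rng.random() < 0.5:                    # each move is made with probability 1/2
        if v in state:
            return state - {v}                # 2(i): delete v
        candidate = state | {v}
        if is_independent(candidate, edges):
            return candidate                  # 2(ii): insert v
    return state                              # 2(iii): otherwise no change


if __name__ == "__main__":
    vertices = list(range(6))
    edges = [frozenset({0, 1, 2}), frozenset({2, 3, 4}), frozenset({4, 5, 0})]
    rng = random.Random(0)
    state = frozenset()
    for _ in range(1000):
        state = glauber_step(state, vertices, edges, rng)
    print(sorted(state))
```

Starting from the empty set, every state visited is independent, because an insertion is only accepted when the candidate set contains no full edge.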

This chain is easily shown to be ergodic with uniform stationary distribution. The natural coupling
for this chain is the "identity" coupling, in which the same transition is attempted in both copies of the
chain. If we try to apply standard path coupling to this chain, we immediately run into difficulties.
Consider a state of the coupled chain at a time $t$, $(X_t, Y_t)$, such that $Y_t = X_t \cup \{w\}$, where $w \notin X_t$
(the change vertex) is of degree $\Delta$. An edge $e \in \mathcal{E}$ is critical in $Y_t$ if it has only one vertex $z \in V$
which is not in $Y_t$, and we call $z$ critical for $e$. If each of the edges through $w$ is critical for $Y_t$, then
there are $\Delta$ choices of $v$ in the transition which can be added in $X_t$ but not in $Y_t$. Thus the change
in the expected Hamming distance between $X_t$ and $Y_t$ after one step could be as high as $\frac{\Delta}{2n} - \frac{1}{n}$,
and we obtain rapid mixing only in the case $\Delta = 2$.
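The bad configuration described above can be checked exactly on a small instance. The sketch below is illustrative only: it models the identity coupling by sharing both the chosen vertex and an insert/delete coin between the two copies (an assumption about how "the same transition is attempted" is realised), and the helper names and the particular hypergraph, in which $w = 0$ lies in $\Delta$ disjoint 3-vertex edges, are hypothetical.

```python
from fractions import Fraction


def is_independent(s, edges):
    return not any(e <= s for e in edges)


def attempt(state, v, insert, edges):
    """Apply the attempted move (insert or delete v) to one copy of the chain."""
    if insert:
        cand = state | {v}
        return cand if v not in state and is_independent(cand, edges) else state
    return state - {v}


def expected_hamming_change(x, y, vertices, edges):
    """Exact E[d_Ham(X_1, Y_1)] - d_Ham(X_0, Y_0) under the shared-move coupling."""
    total = Fraction(0)
    for v in vertices:
        for insert in (True, False):          # each (v, coin) pair has probability 1/(2n)
            x1 = attempt(x, v, insert, edges)
            y1 = attempt(y, v, insert, edges)
            total += Fraction(len(x1 ^ y1), 2 * len(vertices))
    return total - len(x ^ y)


def bad_configuration(delta):
    """w = 0 lies in delta edges {w, a_i, z_i}; each a_i is occupied, each z_i is critical."""
    edges = [frozenset({0, 2 * i + 1, 2 * i + 2}) for i in range(delta)]
    vertices = list(range(2 * delta + 1))
    x = frozenset(2 * i + 1 for i in range(delta))
    return x, x | {0}, vertices, edges


if __name__ == "__main__":
    for delta in range(2, 6):
        x, y, vertices, edges = bad_configuration(delta)
        print(delta, expected_hamming_change(x, y, vertices, edges))
```

On these instances the exact drift equals $\frac{\Delta}{2n} - \frac{1}{n}$, which is nonpositive only for $\Delta \le 2$.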

For $(\sigma, \sigma \cup \{w\}) \in S$, let $E_i(w, \sigma)$ be the set of edges containing $w$ which have $i$ occupied vertices
in $\sigma$. Using a result like Theorem 2.2 above, it is shown in [3] that, for the stopping time $T$ given
by the first epoch at which the Hamming distance between the coupled chains changes,

\[ \mathrm{E}[d_{\mathrm{Ham}}(X_T, Y_T) \mid X_0 = \sigma, Y_0 = \sigma \cup \{w\}] \le 2\sum_{i=0}^{m-2} p_i|E_i| \le 2p_1\Delta, \]

where $p_i$ is the probability that $d(X_T, Y_T) = 2$ if $w$ is in a single edge with $i$ occupied vertices.
Since $p_1 < 1/(m-1)$, we obtain rapid mixing when $2\Delta/(m-1) \le 1$, i.e. when $m \ge 2\Delta + 1$. See
for details. 
The approach of Section 2 would lead us to define a metric for which the distance between $\sigma$
and $\sigma \cup \{w\}$ is $(1 - 2p_1\Delta) + 2\sum_{i=0}^{m-2} p_i|E_i|$. By Lemma 2.1 we know that this metric contracts in
expectation. However, prompted by the form of this metric, but retaining the freedom to optimise
constants, we will instead define the new metric $d$ to be

\[ d(\sigma, \sigma \cup \{w\}) = \sum_{i=0}^{m-2} c_i|E_i|, \]

where $0 < c_i \le 1$ $(0 \le i \le m-2)$ are a nondecreasing sequence of constants to be determined.
Using this metric, we obtain the following theorem.

Theorem 3.1. Let $\Delta$ be fixed, and let $\mathcal{H}$ be a hypergraph such that $m \ge \Delta + 2 \ge 5$, or $\Delta = 3$ and
$m \ge 2$. Then the Markov chain $\mathcal{M}(\mathcal{H})$ has mixing time $O(n \log n)$.

Proof. Without loss of generality, we take $c_{m-2} = 1$ and we will define $c_{-1} = c_0$, $c_{m-1} \ge \Delta + 1$.
Note that $c_{-1}$ has no real role in the analysis, and is chosen only for convenience, but $c_{m-1}$ is
chosen so that $c_{m-1} - c_{m-2} \ge \Delta \ge d(\sigma, \sigma')$ for any pair $(\sigma, \sigma') \in S$. We require $c_i > 0$ for all $i$ so
that we will always have $d(\sigma, \sigma') > 0$ if $\sigma \ne \sigma'$.

Now consider the expected change in distance between $\sigma$ and $\sigma \cup \{w\}$ after one step of the
chain.

If $w$ is chosen, then the distance decreases by $\sum_{i=0}^{m-2} c_i|E_i|$. The contribution to the expected
change in distance is $-\frac{1}{n}\sum_{i=0}^{m-2} c_i|E_i|$.

If we insert a vertex $v$ in an edge containing $w$, then we increase the distance by $(c_{i+1} - c_i) \ge 0$
for each edge in $E_i$ containing $v$. This holds for $i = 0, \ldots, m-2$, by the choice of $c_{m-1} \ge \Delta + 1$.
Let $U$ be the set of unoccupied neighbours of $w$, and $\nu_i(v)$ be the number of edges with $i$ occupants
containing $w$ and $v$. Then the contribution is

\[ \sum_{v \in U} \frac{1}{2n} \sum_{i=0}^{m-2} \nu_i(v)(c_{i+1} - c_i) = \frac{1}{2n} \sum_{i=0}^{m-2} (c_{i+1} - c_i)(m-i-1)|E_i|, \]

since

\[ \sum_{v \in U} \nu_i(v) = \sum_{v \in U} \sum_{e \in E_i : v \in e} 1 = \sum_{e \in E_i} \sum_{v \in e \cap U} 1 = \sum_{e \in E_i} (m-i-1) = (m-i-1)|E_i|. \]

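If we delete a vertex $v$ in an edge containing $w$, then we decrease the distance by $(c_i - c_{i-1})$ for
each edge in $E_i$ containing $v$. This holds for $i = 0, \ldots, m-2$, by the choice of $c_{-1}$. Let $O$ be the
set of occupied neighbours of $w$, and $\nu_i(v)$ be the number of edges with $i$ occupants containing $w$
and $v$. Then the contribution is

\[ -\sum_{v \in O} \frac{1}{2n} \sum_{i=0}^{m-2} \nu_i(v)(c_i - c_{i-1}) = -\frac{1}{2n} \sum_{i=0}^{m-2} (c_i - c_{i-1})\, i\, |E_i|, \]

since, as for $U$ above,

\[ \sum_{v \in O} \nu_i(v) = \sum_{v \in O} \sum_{e \in E_i : v \in e} 1 = \sum_{e \in E_i} \sum_{v \in e \cap O} 1 = \sum_{e \in E_i} i = i|E_i|. \]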

Let $d_0 = d(\sigma, \sigma \cup \{w\})$ and let $d_1$ be the distance between the evolved states after one step of
the chain. The change in expected distance $\mathrm{E}[d_1 - d_0]$ satisfies

\begin{align*}
2n\,\mathrm{E}[d_1 - d_0] &\le -2\sum_{i=0}^{m-2} c_i|E_i| + \sum_{i=0}^{m-2} (c_{i+1} - c_i)(m-i-1)|E_i| - \sum_{i=0}^{m-2} i(c_i - c_{i-1})|E_i| \\
&= \sum_{i=0}^{m-2} \bigl(-2c_i + (m-i-1)(c_{i+1} - c_i) - i(c_i - c_{i-1})\bigr)|E_i| \\
&= \sum_{i=0}^{m-2} \bigl(ic_{i-1} - (m+1)c_i + (m-i-1)c_{i+1}\bigr)|E_i|.
\end{align*}

We require $\mathrm{E}[d_1 - d_0] \le -\gamma$, for some $\gamma \ge 0$, which holds for all possible choices of $E_i$ if and
only if $(m-i-1)c_{i+1} - (m+1)c_i + ic_{i-1} \le -\gamma$ for all $i = 0, 1, \ldots, m-2$. Thus we need a solution
to

\begin{align*}
& ic_{i-1} - (m+1)c_i + (m-i-1)c_{i+1} \le -\gamma \qquad (i = 0, \ldots, m-2), \tag{8} \\
& 0 \le c_{-1} \le c_0 \le c_1 \le \cdots \le c_{m-3} \le c_{m-2} = 1, \\
& c_{m-1} \ge \Delta + 1, \quad \gamma \ge 0,
\end{align*}

with $\gamma > 0$ if possible. Adding (8) from $i$ to $m-2$ gives

\begin{align*}
& ic_{i-1} - (m-i)c_i - (m-1)c_{m-2} + c_{m-1} \le -(m-i-1)\gamma \qquad (i = 0, \ldots, m-2), \\
\text{i.e.}\quad & ic_{i-1} \le (m-i)c_i + (m-\Delta-2) - (m-i-1)\gamma \qquad (i = 0, \ldots, m-1). \tag{9}
\end{align*}

Substitute $u_i = \binom{m-1}{i} c_i$ in (9), so $u_{m-1} \ge \Delta + 1$, $u_{m-2} = m - 1$ and $u_{-1} = 0$. Then we have

\[ u_{i-1} \le u_i + \frac{m-\Delta-2+\gamma}{m}\binom{m}{i} - \gamma\binom{m-1}{i} \qquad (i = 0, \ldots, m-2). \]

Using the boundary condition $u_{-1} = 0$, these give

\[ u_i \ge \gamma\sum_{j=0}^{i}\binom{m-1}{j} - \frac{m-\Delta-2+\gamma}{m}\sum_{j=0}^{i}\binom{m}{j}. \]

The boundary condition $u_{m-2} = m - 1$ now implies

\[ \gamma \le \frac{2^m - 1 - m}{(m-2)2^{m-1} + 1}\Bigl(m - \Delta - 2 + \frac{m(m-1)}{2^m - 1 - m}\Bigr). \]

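Let

\[ f(m) = m - 2 + \frac{m(m-1)}{2^m - 1 - m}, \]

then we can have $\gamma \ge 0$ if and only if $f(m) \ge \Delta$, and $\gamma > 0$ if and only if $f(m) > \Delta$. Then

\[ u_i = \gamma\sum_{j=0}^{i}\binom{m-1}{j} - \frac{m-\Delta-2+\gamma}{m}\sum_{j=0}^{i}\binom{m}{j} \qquad (i = 0, \ldots, m-2). \]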



In order to satisfy the conditions of (8), we need to establish that $0 < c_i \le c_{i+1}$ $(i = 0, \ldots, m-3)$.

\begin{align*}
c_i &= \frac{\gamma\sum_{j=0}^{i}\binom{m-1}{j} - \frac{m-\Delta-2+\gamma}{m}\sum_{j=0}^{i}\binom{m}{j}}{\binom{m-1}{i}} \qquad (i = 0, \ldots, m-2) \\
&= \gamma\,\frac{\sum_{j=0}^{i}\binom{m-1}{j} - \kappa\sum_{j=0}^{i}\binom{m}{j}}{\binom{m-1}{i}}, \qquad \text{where } \kappa = \frac{m-\Delta-2+\gamma}{\gamma m}, \\
&= \gamma\,\frac{\sum_{j=0}^{i}\binom{m-1}{j} - \kappa\bigl(\sum_{j=0}^{i}\binom{m-1}{j} + \sum_{j=0}^{i-1}\binom{m-1}{j}\bigr)}{\binom{m-1}{i}} \\
&= \gamma\,\frac{\sum_{j=0}^{i}\binom{m-1}{j} - \kappa\bigl(2\sum_{j=0}^{i}\binom{m-1}{j} - \binom{m-1}{i}\bigr)}{\binom{m-1}{i}} \\
&= \gamma\,\frac{(1-2\kappa)\sum_{j=0}^{i}\binom{m-1}{j} + \kappa\binom{m-1}{i}}{\binom{m-1}{i}} \\
&= \gamma(1-2\kappa)g_i + \gamma\kappa, \text{ say.}
\end{align*}
Now $2\kappa < 1$ is equivalent to $2(m-\Delta-2)/(m-2) < \gamma$, i.e.

\[ \frac{2(m-\Delta-2)}{m-2} < \frac{(2^m - 1 - m)(m-\Delta-2) + m(m-1)}{(m-2)2^{m-1} + 1}, \]

which holds for all $\Delta > 0$. Finally, $g_i$ is strictly increasing, since

\begin{align*}
\frac{g_{i-1}}{g_i} &= \frac{\frac{m-i}{i}\sum_{j=0}^{i-1}\binom{m-1}{j}}{\sum_{j=0}^{i}\binom{m-1}{j}} \\
&= \frac{\sum_{j=1}^{i}\frac{m-i}{i}\binom{m-1}{j-1}}{\sum_{j=0}^{i}\binom{m-1}{j}} \\
&\le \frac{\sum_{j=1}^{i}\binom{m-1}{j}}{\sum_{j=0}^{i}\binom{m-1}{j}}, \quad \text{since } j \le i, \\
&< 1.
\end{align*}

Hence $c_i$ is strictly increasing. It only remains to verify that $c_0 > 0$. This is clearly equivalent to
$\gamma > (m-\Delta-2)/(m-1)$. If $m = \Delta + 2$, it follows from $\gamma > 0$. If $m > \Delta + 2$, it follows from
$\gamma > 2(m-\Delta-2)/(m-2)$, which we have already established.

If $m \ge 5$ then $m(m-1)/(2^m - 1 - m) < 1$, so we will have $f(m) > \Delta$ exactly when $m \ge \Delta + 2$.
For smaller values of $m$,

    m       2     3                  4
    f(m)    2     $2\tfrac{1}{2}$    $3\tfrac{1}{11}$

The new case here is $\Delta = 3$, $m \ge 4$. In any case for which $f(m) > \Delta$, standard path coupling
arguments yield the mixing times claimed since we have contraction in the metric and the minimum
distance is at least $c_0$. Since we can show mixing for $\Delta = 3$, $m \le 3$ by other means (see [12]), we
have mixing for $\Delta = 3$ and every $m$. □
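The closed forms in this proof can be checked with exact rational arithmetic. The following sketch is illustrative and not part of the original argument: the function names are hypothetical, $\gamma$ is taken at its largest admissible value, and the boundary conventions $c_{-1} = c_0$ and $c_{m-1} = \Delta + 1$ are assumed when evaluating the left-hand sides of (8).

```python
from fractions import Fraction
from math import comb


def f(m):
    """f(m) = m - 2 + m(m-1) / (2^m - 1 - m)."""
    return Fraction(m - 2) + Fraction(m * (m - 1), 2**m - 1 - m)


def optimal_metric(m, delta):
    """Return (gamma, [c_0, ..., c_{m-2}]) using c_i = gamma (1 - 2 kappa) g_i + gamma kappa."""
    gamma = Fraction(2**m - 1 - m, (m - 2) * 2**(m - 1) + 1) * (f(m) - delta)
    if gamma <= 0:
        raise ValueError("no strict contraction: f(m) <= Delta")
    kappa = (m - delta - 2 + gamma) / (gamma * m)

    def g(i):
        return Fraction(sum(comb(m - 1, j) for j in range(i + 1)), comb(m - 1, i))

    cs = [gamma * (1 - 2 * kappa) * g(i) + gamma * kappa for i in range(m - 1)]
    return gamma, cs


def residuals(m, delta, cs):
    """Left-hand sides of (8) for i = 0..m-2, with c_{-1} = c_0 and c_{m-1} = Delta + 1."""
    ext = [cs[0]] + cs + [Fraction(delta + 1)]        # ext[i + 1] holds c_i
    return [i * ext[i] - (m + 1) * ext[i + 1] + (m - i - 1) * ext[i + 2]
            for i in range(m - 1)]


if __name__ == "__main__":
    gamma, cs = optimal_metric(4, 3)                  # the new case Delta = 3, m = 4
    print(gamma, cs, residuals(4, 3, cs))
```

For $m = 4$, $\Delta = 3$ this gives $\gamma = 1/17$ and $(c_0, c_1, c_2) = (5/17, 8/17, 1)$, and every inequality in (8) holds with equality at $-\gamma$.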
Remark 3.2. The independent set problem here has a natural dual, that of sampling an edge cover
from a hypergraph with edge size $\Delta$ and degree $m$. An edge cover is a subset of $\mathcal{E}$ whose union
contains $V$. For the graph case of this sampling problem, with arbitrary $m$, see [I]. By duality this
gives the case $\Delta = 2$ of the independent set problem here.

4 Satisfying assignments of SAT instances 

The set of independent sets in a hypergraph with edge size $m$ and degree $\Delta$ corresponds in a natural
way to the set of satisfying assignments in a SAT instance with clause size equal to $m$ and the number
of occurrences of each variable bounded by $\Delta$, cf. [12]. The optimisation problems connected to
small (variable) occurrence number instances of SAT were studied recently in (see also for 
additional references). 

Suppose we are given a hypergraph $\mathcal{H} = (V, \mathcal{E})$ with $n$ vertices, $k$ hyperedges, edge size $m$ and degree $\Delta$.
We construct an $m$SAT formula $f$ over $n$ variables $X = \{x_1, \ldots, x_n\}$ corresponding to the vertices of $\mathcal{H}$
as follows. If $e = \{v_{i_1}, \ldots, v_{i_m}\}$ is a hyperedge of $\mathcal{H}$, we associate with $e$ a clause $C_e = \bigvee_{j=1}^{m} \bar{x}_{i_j}$, and
furthermore we set $f = \bigwedge_{e \in \mathcal{E}} C_e$. Notice that the number of satisfying assignments of $f$ is precisely
the same as the number of independent sets of $\mathcal{H}$, and the number of occurrences of each variable in $f$
is less than or equal to the degree of $\mathcal{H}$. We can moreover replace the literals $\bar{x}_i$ by $x_i$, to obtain a
monotone $m$SAT formula $f'$ with the same number of variable occurrences as $f$ and with the same
number of satisfying assignments. The above construction is reversible, showing the equivalence
of the corresponding counting problems for hypergraph independent sets and monotone SAT formulas.
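The correspondence described above can be confirmed by brute force on a small instance. The sketch below is illustrative only; the function names and the example hypergraph are hypothetical, and both counters simply enumerate all $2^n$ assignments.

```python
from itertools import product


def count_independent_sets(n, edges):
    """Count vertex subsets of {0, ..., n-1} that contain no hyperedge."""
    count = 0
    for bits in product((0, 1), repeat=n):
        chosen = {v for v in range(n) if bits[v]}
        if not any(set(e) <= chosen for e in edges):
            count += 1
    return count


def count_satisfying(n, clauses, negated):
    """Count assignments satisfying a CNF whose literals are all negated or all positive."""
    count = 0
    for bits in product((False, True), repeat=n):
        # a literal on variable i is true when bits[i] differs from `negated`
        if all(any(bits[i] != negated for i in clause) for clause in clauses):
            count += 1
    return count


if __name__ == "__main__":
    edges = [(0, 1, 2), (2, 3, 4), (0, 3, 4)]
    print(count_independent_sets(5, edges),
          count_satisfying(5, edges, negated=True),
          count_satisfying(5, edges, negated=False))
```

The three counts agree: a set is independent exactly when every clause of negated literals has a false variable, and complementing every assignment maps the all-negated formula to the monotone one.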

Let us denote by $\#(m, \Delta)\mu$SAT the problem of counting the number of satisfying assignments in
monotone $m$SAT instances with at most $\Delta$ variable occurrences. Theorem 3.1 yields the first
FPRASs (Fully Polynomial Randomized Approximation Schemes) for a large class of monotone
$m$SAT formulas.

Theorem 4.1. Let $\Delta$ be fixed, and $m \ge \Delta + 2 \ge 5$, or if $\Delta = 3$ then $m \ge 2$. Then the associated
Markov chain $\mathcal{M}(\mathcal{H})$ yields an FPRAS for the $\#(m, \Delta)\mu$SAT problem.

The above result vastly improves the hitherto known results for approximately counting the
number of satisfying assignments of general monotone SAT formulas.

5 Colouring 3-uniform hypergraphs 

In our second application, also from [2], we consider proper colourings of 3-uniform hypergraphs.
We again use Glauber dynamics. Our hypergraph $\mathcal{H}$ will have maximum degree $\Delta$, uniform edge
size 3, and we will have a set of $q$ colours. For a discussion of the easier problem of colouring
hypergraphs with larger edge size see (Hj- A colouring of the vertices of TL is proper if no edge is 
monochromatic. Let $\Omega'(\mathcal{H})$ be the set of all proper $q$-colourings of $\mathcal{H}$. We define the Markov chain
$\mathcal{C}(\mathcal{H})$ with state space $\Omega'(\mathcal{H})$ by the following transition process. If the state of $\mathcal{C}$ at time $t$ is $X_t$,
the state at $t + 1$ is determined by

1. selecting a vertex $v \in V$ and a colour $k \in \{1, 2, \ldots, q\}$ uniformly at random,

2. letting $X'_t$ be the colouring obtained by recolouring $v$ with colour $k$,

3. if $X'_t$ is a proper colouring, letting $X_{t+1} = X'_t$; otherwise letting $X_{t+1} = X_t$.
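The three steps above translate directly into code. This is a minimal sketch under stated assumptions: a colouring is a tuple indexed by vertex, hyperedges are tuples of vertices, colours are the integers $1, \ldots, q$, and the names `is_proper` and `colouring_step` are hypothetical.

```python
import random


def is_proper(colouring, edges):
    """A colouring is proper if no hyperedge is monochromatic."""
    return all(len({colouring[v] for v in edge}) > 1 for edge in edges)


def colouring_step(colouring, edges, q, rng):
    """One transition of the chain on proper q-colourings."""
    v = rng.randrange(len(colouring))                       # step 1: uniform vertex ...
    k = rng.randint(1, q)                                   # ... and uniform colour
    proposal = colouring[:v] + (k,) + colouring[v + 1:]     # step 2: recolour v with k
    # step 3: only edges through v can have become monochromatic
    if is_proper(proposal, [e for e in edges if v in e]):
        return proposal
    return colouring


if __name__ == "__main__":
    edges = [(0, 1, 2), (2, 3, 4), (4, 5, 0)]
    rng = random.Random(1)
    colouring = (1, 2, 3, 1, 2, 3)
    for _ in range(1000):
        colouring = colouring_step(colouring, edges, 3, rng)
    print(colouring)
```

Checking only the edges through $v$ is enough because the colouring before the move is proper and no other edge changes.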
This chain is easily shown to be ergodic with the uniform stationary distribution. For some large
enough constant $\Delta_0$, it was shown in [3] to be rapidly mixing for $q > 1.65\Delta$ and $\Delta > \Delta_0$, using a
stopping times analysis. Here we improve this result, and simplify the proof, by using a carefully
chosen metric which is prompted by the new insight into stopping times analyses. If $w$ is the change
vertex, the intuition in [3] was that edges which contain both colours of $w$ are initially "dangerous"
but tend to become less so after a time. Thus our metric will be a function of the numbers of edges
containing $w$ with various relevant colourings.

Theorem 5.1. Let $\Delta$ be fixed, and let $\mathcal{H}$ be a 3-uniform hypergraph of maximum degree $\Delta$. Then
if $q \ge \lceil \frac{3}{2}\Delta + 1 \rceil$, the Markov chain $\mathcal{C}(\mathcal{H})$ has mixing time $O(n \log n)$.

Proof. Consider two proper colourings $X$ and $Y$ differing in a single vertex $w$. Without loss of generality let the change vertex $w$ be coloured 1 in $X$ and 2 in $Y$. We will partition the edges $e \in \mathcal{E}$ containing $w$ into four classes $E_1, E_2, E_3, E_4$, determined by the colouring of $e \setminus \{w\}$, as follows:

\[ E_1 : \{1,2\} \qquad E_2 : \{1,i\} \text{ or } \{2,i\} \ (2 < i) \qquad E_3 : \{i,i\} \ (2 < i) \qquad E_4 : \{i,j\} \ (2 < i < j). \]

Instead of using Hamming distance, we will take a new metric defined by

\[ d(X,Y) = \sum_{i=1}^{4} c_i |E_i|, \]

where $1 = c_1 \ge c_2 \ge c_3 \ge c_4 > 0$, and for convenience $c_0 = \Delta + 1$. Note that $d(X,Y) \le \Delta$ if $X, Y$ have Hamming distance 1. The diameter is therefore at most $\Delta n$ in the metric $d$.
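To make the edge classes and the metric concrete, the following sketch classifies each edge containing $w$ by the colours of its two other vertices and sums the corresponding weights. The function names and the particular weights used in the example are hypothetical choices for illustration; any weights with $1 = c_1 \ge c_2 \ge c_3 \ge c_4 > 0$ could be used.

```python
from fractions import Fraction

def edge_class(rest):
    """Class index (1 to 4) of an edge, from the colours of its two vertices other than w."""
    a, b = sorted(rest)
    if (a, b) == (1, 2):
        return 1
    if a in (1, 2) and b > 2:      # {1, i} or {2, i} with i > 2
        return 2
    if a == b and a > 2:           # {i, i} with i > 2
        return 3
    if 2 < a < b:                  # {i, j} with 2 < i < j
        return 4
    # {1, 1} or {2, 2} would make the edge monochromatic in X or in Y
    raise ValueError("edge would be monochromatic in X or Y")

def metric(rests, c):
    """d(X, Y) = sum_i c_i |E_i| over the edges containing the change vertex w."""
    return sum(c[edge_class(r)] for r in rests)

# Illustrative weights only (an assumption of this example).
weights = {1: Fraction(1), 2: Fraction(9, 13), 3: Fraction(5, 13), 4: Fraction(5, 13)}
distance = metric([(1, 2), (5, 1), (3, 4), (6, 6)], weights)
```

Since every weight is at most 1 and $w$ lies in at most $\Delta$ edges, the value returned never exceeds $\Delta$, matching the bound on $d(X,Y)$ for colourings at Hamming distance 1.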
Arguing as in Section 3 we have

\[
\begin{aligned}
nq\,\mathrm{E}[d_1 - d_0] \le{} & -(q - |E_3|)\bigl(c_1|E_1| + c_2|E_2| + c_3|E_3| + c_4|E_4|\bigr) \\
& + |E_1|\bigl(-2(q-\Delta-1)(c_1-c_2) + 2(c_0-c_1)\bigr) \\
& + |E_2|\bigl(-(q-\Delta-2)(c_2-c_4) - (c_2-c_3) + (c_0-c_2) + (c_1-c_2)\bigr) \\
& + |E_3|\bigl(-2(q-\Delta-2)(c_3-c_4) + 4(c_2-c_3)\bigr) \\
& + |E_4|\bigl(2(c_3-c_4) + 4(c_2-c_4)\bigr).
\end{aligned}
\tag{10}
\]

If, in (10), we set

\[
\begin{aligned}
2(q-\Delta-1)(c_1-c_2) - 2(c_0-c_1) + c_1(q-|E_3|) &= \gamma \\
(q-\Delta-2)(c_2-c_4) + (c_2-c_3) - (c_0-c_2) - (c_1-c_2) + c_2(q-|E_3|) &= \gamma \\
2(q-\Delta-2)(c_3-c_4) - 4(c_2-c_3) + c_3(q-|E_3|) &= \gamma \\
-2(c_3-c_4) - 4(c_2-c_4) + c_4(q-|E_3|) &= \gamma,
\end{aligned}
\tag{11}
\]

where $\gamma > 0$, we have

\[ \mathrm{E}[d_1] \le d_0 - \frac{\gamma}{nq}\sum_{i=1}^{4}|E_i| \le \Bigl(1 - \frac{\gamma}{nq}\Bigr) d_0. \tag{12} \]


Note that if we put $q' = q - |E_3|$, $\Delta' = \Delta - |E_3|$ in (11), we have

\[
\begin{aligned}
2(q'-\Delta'-1)(c_1-c_2) - 2(c_0-c_1) + c_1 q' &= \gamma \\
(q'-\Delta'-2)(c_2-c_4) + (c_2-c_3) - (c_0-c_2) - (c_1-c_2) + c_2 q' &= \gamma \\
2(q'-\Delta'-2)(c_3-c_4) - 4(c_2-c_3) + c_3 q' &= \gamma \\
-2(c_3-c_4) - 4(c_2-c_4) + c_4 q' &= \gamma.
\end{aligned}
\tag{13}
\]

This corresponds to a system like (11) with degree $\Delta'$, $q'$ colours and $|E_3| = 0$. But, since $q'/\Delta' = (q - |E_3|)/(\Delta - |E_3|) \ge q/\Delta$, the smallest ratio for $q/\Delta$ is given by setting $|E_3| = 0$ in (11). Also, putting $c_3 = c_4$ makes the third and fourth equations in (13) identical, so $c_3 = c_4$ must be a solution. With these simplifications, and putting $c_0 = \Delta + 1$, $c_1 = 1$, we have

\[
\begin{aligned}
2(q-\Delta-1)(1-c_2) - 2\Delta + q &= \gamma \\
(q-\Delta-1)(c_2-c_4) - 2(1-c_2) - \Delta + c_2 q &= \gamma \\
-4(c_2-c_4) + c_4 q &= \gamma.
\end{aligned}
\]

Now the linear equations (11) may be solved for $c_2$, $c_4$ and $\gamma$, giving

\[ c_1 = 1, \qquad c_2 = \frac{2q - 2\Delta + 1}{2q - \Delta + 1}, \qquad c_3 = c_4 = \frac{2q - 3\Delta + 1}{2q - \Delta + 1}, \qquad \gamma = \frac{2q^2 - q(3\Delta - 1) - 4\Delta}{2q - \Delta + 1}. \]
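The stated solution can be checked mechanically by substituting it back into the three simplified equations (with $c_0 = \Delta + 1$, $c_1 = 1$, $c_3 = c_4$ and $|E_3| = 0$). The sketch below does this in exact rational arithmetic; the function names are ours and are used only for illustration.

```python
from fractions import Fraction

def solution(q, delta):
    """The closed-form values of c_2, c_4 (= c_3) and gamma."""
    D = Fraction(2 * q - delta + 1)
    c2 = (2 * q - 2 * delta + 1) / D
    c4 = (2 * q - 3 * delta + 1) / D
    gamma = (2 * q * q - q * (3 * delta - 1) - 4 * delta) / D
    return c2, c4, gamma

def residuals(q, delta):
    """Left-hand side minus gamma for each of the three simplified equations."""
    c2, c4, gamma = solution(q, delta)
    return (
        2 * (q - delta - 1) * (1 - c2) - 2 * delta + q - gamma,
        (q - delta - 1) * (c2 - c4) - 2 * (1 - c2) - delta + c2 * q - gamma,
        -4 * (c2 - c4) + c4 * q - gamma,
    )
```

All three residuals vanish identically, so the check does not depend on the particular values of `q` and `delta` tried.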

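The condition $\gamma > 0$ is equivalent to

\[ q > \frac{3\Delta - 1}{4}\left(1 + \sqrt{1 + \frac{32\Delta}{(3\Delta - 1)^2}}\right) \quad\Longleftrightarrow\quad q \ge \left\lceil \tfrac{3}{2}\Delta \right\rceil + 1. \]

Note that we have $c_i > 0$ $(i = 1, \ldots, 4)$ under this condition. Note also that $\gamma > 0$ and hence, using (12), the mixing time satisfies

\[ \tau(\varepsilon) \le \frac{2q^2 - q\Delta + q}{2q^2 - q(3\Delta - 1) - 4\Delta}\; n \ln\left(\frac{\Delta n}{\varepsilon}\right). \qquad \Box \]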
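As a sanity check on the colour threshold, one can search directly for the smallest integer $q$ making the numerator of $\gamma$ positive and compare it with $\lceil \frac{3}{2}\Delta \rceil + 1$. The sketch below also evaluates the mixing time bound $\frac{nq}{\gamma} \ln(\Delta n/\varepsilon)$ that closes the proof. Function names are assumptions of this example.

```python
import math

def gamma_numerator(q, delta):
    """Numerator of gamma; the denominator 2q - delta + 1 is positive for q >= delta."""
    return 2 * q * q - q * (3 * delta - 1) - 4 * delta

def min_colours(delta):
    """Smallest integer q with gamma > 0, found by direct search."""
    q = 1
    while gamma_numerator(q, delta) <= 0:
        q += 1
    return q

def mixing_time_bound(n, q, delta, eps):
    """(n q / gamma) * ln(delta * n / eps); requires gamma > 0."""
    ratio = (2 * q * q - q * delta + q) / gamma_numerator(q, delta)
    return ratio * n * math.log(delta * n / eps)
```

The search starts at $q = 1$, where the numerator is negative, so the first $q$ at which it becomes positive is the smallest integer above the positive root of the quadratic.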



6 Colouring bipartite graphs 

Our final application is to colouring bipartite graphs. Several recent papers have used a stopping times or "burn in" analysis to prove rapid mixing for Glauber dynamics of graph colouring, starting with [?]. These are largely based upon the idea that although a vertex can have only $q - \Delta$ colours with which to be properly recoloured, it is very unlikely for any vertex to have so few colours available after a period of "burn in". Subject to more stringent girth and degree restrictions than used here, rapid mixing has been proved for fewer colours [?, 14, ?]. Here we capture this intuition by using a metric which directly incorporates the number of colours available to a vertex. In order to simplify the analysis, we do not consider Glauber dynamics here. Instead we prove that a Markov chain Scan which uses the same method for recolouring a vertex as Glauber dynamics, but recolours the vertices in a deterministic order, mixes rapidly. In order to show this we first prove results for a closely related Markov chain, Multicolour, which is of interest in its own right.

Let $G = (V, E)$ be a bipartite graph with bipartition $V_1, V_2$, and maximum degree $\Delta$. For $v \in V$, let $\mathcal{N}(v) = \{w : \{v, w\} \in E\}$ denote the neighbourhood of $v$. Let $Q = [q]$ be a colour set, and $X : V \to Q$ be a colouring of $G$, not necessarily proper. Let $C(v) = \{X(w) : w \in \mathcal{N}(v)\}$ be the set of colours occurring in the neighbourhood of $v$, and $c(v) = |C(v)|$. We consider the Markov chain Multicolour on colourings of $G$, which in each step picks one side of the bipartition at random, and then recolours every vertex on that side, followed by recolouring every vertex in the other half of the bipartition. If the state of Multicolour at time $t$ is $X_t$, the state at time $t + 1$ is given by

Multicolour

1. choosing $r \in \{1, 2\}$ uniformly at random,

2. for each vertex $v \in V_r$,

(i) choosing a colour $q(v) \in Q \setminus C(v)$ uniformly at random,

(ii) setting $X_{t+1}(v) = q(v)$. (Heat bath recolouring)

3. for each vertex $v \in V \setminus V_r$,

(i) choosing a colour $q(v) \in Q \setminus C(v)$ uniformly at random,

(ii) setting $X_{t+1}(v) = q(v)$.
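One step of Multicolour as listed above can be sketched as follows. The graph representation (an adjacency dictionary and a pair of vertex lists for the bipartition) and the function name are assumptions of this example, and the sketch presumes $q \ge \Delta + 1$ so that every vertex always has at least one available colour.

```python
import random

def multicolour_step(colouring, adj, sides, q, rng=random):
    """One step of Multicolour on a bipartite graph.

    colouring: dict vertex -> colour in 1..q (not necessarily proper)
    adj:       dict vertex -> list of neighbours
    sides:     pair (V1, V2) of vertex lists forming the bipartition
    """
    r = rng.randrange(2)                      # step 1: which side is recoloured first
    new = dict(colouring)
    for side in (sides[r], sides[1 - r]):     # steps 2 and 3
        for v in side:
            used = {new[w] for w in adj[v]}   # C(v), read from the current colours
            available = [k for k in range(1, q + 1) if k not in used]
            new[v] = rng.choice(available)    # heat bath: uniform on Q \ C(v)
    return new
```

Because the neighbours of a vertex all lie on the opposite side, the colours read in `used` do not change while a side is being processed. After a full step the second side avoids the fresh colours of the first, so the result is a proper colouring even when the input is not.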

Note that the order in which the vertices are processed in steps 2 and 3 is immaterial. This chain is a single-site dynamics intermediate between Glauber and scan. It is easy to see that it is ergodic if $q > \Delta + 1$, and has equilibrium distribution uniform on all proper colourings of $G$. Observe also that it requires considerably fewer random bits than Glauber, and only slightly more than scan. We prove the following theorem.

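Theorem 6.1. For $q \ge f(\Delta)$ the mixing times of Scan and Multicolour are $O(\log n)$, where $f$ is a function such that

1. $f(\Delta) \to \beta\Delta$, as $\Delta \to \infty$, where $\beta$ satisfies $\frac{1}{\beta} e^{1/\beta} = 1$,

2. $f(\Delta) \le \lceil 11\Delta/6 \rceil$ for $\Delta \ge 14$,

3. $f(\Delta) < \lceil 11\Delta/6 \rceil$ for $\Delta \ge 31$,

4. in particular $f(22) = 40 < \lceil 11\Delta/6 \rceil$.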
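The constant $\beta$ in item 1 has no elementary closed form, but it is easy to approximate numerically. The following sketch solves $\frac{1}{\beta} e^{1/\beta} = 1$ by bisection; the bracketing interval $[1, 3]$ and the function name are choices made for this example.

```python
import math

def beta(tol=1e-12):
    """Solve (1/b) * exp(1/b) = 1 for b by bisection on [1, 3]."""
    g = lambda b: math.exp(1.0 / b) / b - 1.0   # strictly decreasing for b > 0
    lo, hi = 1.0, 3.0                           # g(lo) > 0 > g(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

The value obtained is approximately 1.763, consistent with the figure $\beta \approx 1.76$ quoted in the abstract.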

We will require the following lemmas. 

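Lemma 6.2. For $1 \le i \le \Delta$ let $S_i$ be a subset of $(Q - q_0)$ such that $m_i = |S_i| \ge q - \Delta$. Let $s_i$ be selected uniformly at random from $S_i$, independently for each $i$. Finally let $C = \{s_i : 1 \le i \le \Delta\}$ and $c = |C|$. Then

\[ \mathrm{E}[q - c \mid s_1 = q_1] \ge 1 + (q-2)\left(1 - \frac{1}{q-\Delta}\right)^{\frac{(\Delta-1)(q-\Delta)}{q-2}} = \alpha. \]

Proof. This follows from [8, Lemma 2.1] with minor adjustments as follows. Let $a_{ij} = 1$ if $j \in S_i$ and 0 otherwise. Thus $m_i = \sum_{j \in (Q - q_0)} a_{ij}$ and

\[ \mathrm{E}[q - c] = 1 + \sum_{j \in (Q - q_0)} \prod_{i=1}^{\Delta} \left(1 - \frac{a_{ij}}{m_i}\right). \]

However if we are given that $s_1 = q_1$, then

\[
\begin{aligned}
\mathrm{E}[q - c \mid s_1 = q_1] &= 1 + \sum_{j \in (Q - q_0 - q_1)} \prod_{i=2}^{\Delta} \left(1 - \frac{a_{ij}}{m_i}\right) \\
&\ge 1 + (q-2)\left(\prod_{j \in (Q - q_0 - q_1)} \prod_{i=2}^{\Delta} \left(1 - \frac{a_{ij}}{m_i}\right)\right)^{\frac{1}{q-2}} \\
&\ge 1 + (q-2)\left(\prod_{i=2}^{\Delta} \left(1 - \frac{1}{m_i}\right)^{m_i}\right)^{\frac{1}{q-2}} \\
&\ge 1 + (q-2)\left(1 - \frac{1}{q-\Delta}\right)^{\frac{(\Delta-1)(q-\Delta)}{q-2}},
\end{aligned}
\]

where the final inequality follows because $(1 - 1/m_i)^{m_i}$ is increasing with $m_i$ and $m_i \ge q - \Delta$ for all $i$. □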
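For small instances the conditional expectation in Lemma 6.2 can be computed exactly by enumerating the choices $s_2, \ldots, s_\Delta$ and compared with the bound $\alpha = 1 + (q-2)\bigl(1 - \frac{1}{q-\Delta}\bigr)^{(\Delta-1)(q-\Delta)/(q-2)}$. The sketch below is an illustration only; the function names and the example sets are ours.

```python
from fractions import Fraction
from itertools import product

def alpha(q, delta):
    """The lower bound of Lemma 6.2, evaluated in floating point."""
    base = 1 - 1 / (q - delta)
    return 1 + (q - 2) * base ** ((delta - 1) * (q - delta) / (q - 2))

def expected_unused(q, sets, q1):
    """Exact E[q - c | s_1 = q1], enumerating s_2, ..., s_Delta uniformly."""
    assert q1 in sets[0]
    rest = [sorted(s) for s in sets[1:]]
    total, count = 0, 0
    for choice in product(*rest):          # every tuple is equally likely
        total += q - len({q1, *choice})    # q minus the number of distinct colours
        count += 1
    return Fraction(total, count)

# Example with q = 5, Delta = 3, colours 1..5 and q_0 = 5 excluded from every S_i.
example = expected_unused(5, [{1, 2}, {1, 2}, {3, 4}], q1=1)
```

In this example each $S_i$ has size $q - \Delta = 2$, the smallest size the lemma allows, and the exact expectation is $5/2$, above the bound $\alpha \approx 2.19$.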

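Lemma 6.3. For $1 \le i \le \Delta$ let $S_i$ be a subset of $(Q - q_0)$ such that $m_i = |S_i| \ge q - \Delta$. Let $s_i$ be selected uniformly at random from $S_i$, independently for each $i$. Finally let $C = \{s_i : 1 \le i \le \Delta\}$ and $c = |C|$. Then

\[ \mathrm{E}\left[\frac{1}{q-c} \,\middle|\, s_1 = q_1\right] \le \frac{1}{\alpha}\left(1 + \frac{(q-\alpha-1)(\alpha-1)}{(q-\Delta)(q-2)\alpha}\right) = \alpha'. \]

Proof. We will write $\bar{c}$ for $\mathrm{E}[c \mid s_1 = q_1]$. Let $Z = \frac{c - \bar{c}}{q - \bar{c}}$, so that

\[ \frac{1}{q - c} = \frac{1}{(q - \bar{c})(1 - Z)}. \tag{14} \]

Note that $(1 - Z)^{-1} = \frac{q - \bar{c}}{q - c} \le \frac{q - \bar{c}}{q - \Delta}$. Now

\[ \frac{1}{1 - Z} = 1 + Z + \frac{Z^2}{1 - Z} \le 1 + Z + \frac{(q - \bar{c})Z^2}{q - \Delta}. \]

Hence

\[ \mathrm{E}\bigl[(1 - Z)^{-1} \mid s_1 = q_1\bigr] \le 1 + \frac{q - \bar{c}}{q - \Delta} \cdot \frac{\mathrm{Var}(c \mid s_1 = q_1)}{(q - \bar{c})^2} = 1 + \frac{\mathrm{Var}(c \mid s_1 = q_1)}{(q - \Delta)(q - \bar{c})}. \tag{15} \]

We now turn our attention to bounding $\mathrm{Var}(c \mid s_1 = q_1)$. Let $c = \sum_{j \in (Q - q_0)} I_j$, where $I_j$ indicates that colour $j$ is in $C$. Now, conditional on $s_1 = q_1$, we have

\[ \mathrm{Var}\Bigl(\sum_{j \in (Q - q_0)} I_j\Bigr) = \sum_{j \in (Q - q_0)} \mathrm{Var}(I_j) + 2\sum_{j < k} \mathrm{Cov}(I_j, I_k) \le \sum_{j \in (Q - q_0)} \mathrm{Var}(I_j), \]

since $I_j$ and $I_k$ are negatively correlated for all $j$ and $k$. Let $p_j = \Pr(I_j = 1)$, then $I_j$ has variance $p_j(1 - p_j)$ and $\sum_{j \in (Q - q_0)} p_j = \bar{c}$. Also note that $p_{q_1} = 1$, hence $\mathrm{Var}(I_{q_1}) = 0$. By convexity, the maximum of $\sum_{j \in (Q - q_0 - q_1)} p_j(1 - p_j)$ such that $\sum_{j \in (Q - q_0 - q_1)} p_j = \bar{c} - 1$ is given by setting $p_j = (\bar{c} - 1)/(q - 2)$. Hence, using $\bar{c} = q - \alpha$,

\[ \mathrm{Var}(c \mid s_1 = q_1) \le (\bar{c} - 1)\left(1 - \frac{\bar{c} - 1}{q - 2}\right) = \frac{(q - \alpha - 1)(\alpha - 1)}{q - 2}. \tag{16} \]

Putting together equations (14), (15) and (16) we have

\[ \mathrm{E}\left[\frac{1}{q-c} \,\middle|\, s_1 = q_1\right] \le \frac{1}{\alpha}\left(1 + \frac{(q-\alpha-1)(\alpha-1)}{(q-\Delta)(q-2)\alpha}\right). \qquad \Box \]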



Proof of Theorem 6.1. We first prove the theorem for Multicolour. In the path coupling setting, we will take $S$ to be the set of pairs of colourings which differ at exactly one vertex. Let $v$ be the change vertex for some pair $(X, Y) \in S$, and assume without loss that $v \in V_1$. The distance between $X$ and $Y$ is defined to be $d(X, Y) = \sum_{w \in \mathcal{N}(v)} \frac{1}{q - c_{X,Y}(w)}$, where $c_{X,Y}(w)$ is taken to be $\min\{c_X(w), c_Y(w)\}$ in the case that they differ. We couple as follows (the usual path coupling for Glauber dynamics). If we are recolouring a vertex which is not a neighbour of $v$, then the sets of available colours in $X$ and $Y$ are the same, and we use the same colour in both copies of the chain. If we are recolouring a vertex $w \in \mathcal{N}(v)$ then there are three cases to consider:

1. $|\{X(v), Y(v)\} \cap \{X(z) : z \in \mathcal{N}(w) \setminus \{v\}\}| = 2$.

The colours $X(v)$ and $Y(v)$ are not available for recolouring $w$ in either copy of the chain, hence the sets of available colours are the same, and we use the same colour in both copies of the chain.

2. $|\{X(v), Y(v)\} \cap \{X(z) : z \in \mathcal{N}(w) \setminus \{v\}\}| = 1$.

Without loss assume colour $X(v)$ is not available to $w$ in either copy of the chain. Colour $Y(v)$ is only available in $X$. We couple recolouring $w$ in $X$ with any colour other than $Y(v)$, with recolouring using the same colour in $Y$. We couple recolouring $w$ in $X$ with colour $Y(v)$, uniformly between recolouring $w$ with each available colour in $Y$.

3. $|\{X(v), Y(v)\} \cap \{X(z) : z \in \mathcal{N}(w) \setminus \{v\}\}| = 0$.

Here colour $Y(v)$ is only available in chain $X$, and $X(v)$ is only available in $Y$. We couple together recolouring with these colours respectively, and for each other colour (that is available to both copies), we recolour $w$ with the same colour in both $X$ and $Y$.

Note that in case 1 there is no probability of $w$ being coloured differently in the two chains. In the other cases, the probability of disagreement at $w$ is $\frac{1}{q - c_{X,Y}(w)}$.
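The disagreement probabilities in the three cases can be checked on a small example. The sketch below assumes the standard fact that two uniform distributions on finite sets $A$ and $B$ can be coupled so that they disagree with probability $1 - |A \cap B| / \max(|A|, |B|)$, which is what the coupling described above achieves. The helper names and the example colours are ours.

```python
from fractions import Fraction

def available(q, neighbour_colours):
    """Colours in {1, ..., q} not used in the neighbourhood."""
    return set(range(1, q + 1)) - set(neighbour_colours)

def disagreement_probability(avail_x, avail_y):
    """Disagreement probability of an optimal coupling of two uniform choices."""
    common = len(avail_x & avail_y)
    return 1 - Fraction(common, max(len(avail_x), len(avail_y)))

def case(q, xv, yv, others):
    """Disagreement probability at w when v has colour xv in X and yv in Y,
    and the other neighbours of w have the colours in `others`."""
    return disagreement_probability(available(q, [xv, *others]),
                                    available(q, [yv, *others]))
```

With $q = 6$, $X(v) = 1$ and $Y(v) = 2$, taking the other neighbours of $w$ to have colours $\{1, 2, 3\}$, $\{1, 3\}$ and $\{3\}$ gives cases 1, 2 and 3 respectively.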

Let $X', Y'$ be the colourings after recolouring $V_r$ (half a step of Multicolour) and $X'', Y''$ be the colourings after the full step of Multicolour. If we randomly select $V_1$ to be recoloured first, then the two copies of the chain have coupled in $X'$ and $Y'$ since the vertices in $V_1$ have the same set of available colours in each chain.

So suppose that we select $V_2$ to be recoloured first. The only vertices in $V_2$ that have different sets of available colours are those which are neighbours of $v$. Let $\mathcal{N}(v) = \{w_1, \ldots, w_k\}$ and consider the path $W_0, W_1, \ldots, W_{k+1}$ from $X'$ to $Y'$, where for $1 \le i \le k$, $W_i$ agrees with $X'$ on all vertices except $w_1, \ldots, w_i$ which are coloured as in $Y'$, and $W_0 = X'$ and $W_{k+1} = Y'$. Then for $i \le k$ we have

\[ d(W_{i-1}, W_i) = 1_{w_i} \sum_{z \in \mathcal{N}(w_i)} \frac{1}{q - c_{W_{i-1},W_i}(z)}, \tag{17} \]

where $1_{w_i}$ indicates whether $X'$ and $Y'$ differ on $w_i$. Note that $\Pr[1_{w_i} = 1] \le \frac{1}{q - c_{X,Y}(w_i)}$. Furthermore, by the construction of the coupling either conditioning on $1_{w_i} = 1$ is the same as conditioning that $W_{i-1}(w_i) = q_1$, or that $W_i(w_i) = q_1$, for some $q_1$. We assume without loss that this is $W_i$. Then for each $z \in \mathcal{N}(w_i) - v$ the selection of colours in $C_{W_i}(z)$ satisfies the conditions of Lemma 6.3, since we may take $q_0 = X(z)$ and $q_1$ as above. For $v$, there is no colour $q_0$ which is necessarily unavailable for all its neighbours, since some are coloured as in $X'$ and some as in $Y'$. Hence we use a slightly weaker bound on $\alpha$ and $\alpha'$, given by

\[ \alpha_v = (q-1)\left(1 - \frac{1}{q-\Delta}\right)^{\frac{(\Delta-1)(q-\Delta)}{q-1}} \quad\text{and}\quad \alpha'_v = \frac{1}{\alpha_v}\left(1 + \frac{(q-\alpha_v-1)\alpha_v}{(q-\Delta)(q-1)\alpha_v}\right). \]

Hence for $i \le k$, $\mathrm{E}[d(W_{i-1}, W_i)] \le \frac{1}{q - c_{X,Y}(w_i)}\bigl((\Delta-1)\alpha' + \alpha'_v\bigr)$. The value of $d(W_k, W_{k+1})$ is still $d(X, Y)$ since the vertices in $V_1$ have not yet been recoloured.

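Now we consider the vertices in $V_1$. We apply the same analysis as above to each path segment $W_{i-1}, W_i$, but augment the analysis using the fact that at the time a vertex $z \in V_1$ is recoloured, its neighbours (in $V_2$) will already have been randomly recoloured. Let the neighbours of $w_i$ be $z_1, z_2, \ldots, z_l$, and consider the path $Z_0, Z_1, \ldots, Z_{l+1}$, where for $1 \le j \le l$, $Z_j$ agrees with $W_{i-1}$ on all vertices except $z_1, \ldots, z_j$ which are coloured as in $W_i$, and $Z_0 = W_{i-1}$ and $Z_{l+1} = W_i$. Arguing as above, for $j \le l$ we have

\[ d(Z_{j-1}, Z_j) = 1_{z_j} \sum_{w \in \mathcal{N}(z_j)} \frac{1}{q - c_{Z_{j-1},Z_j}(w)}. \]

But now $\Pr[1_{z_j} = 1 \mid W_{i-1}, W_i] \le 1_{w_i} \frac{1}{q - c_{W_{i-1},W_i}(z_j)}$. This is similar to equation (17), and the same argument gives $\Pr[1_{z_j} = 1] \le \frac{1}{q - c_{X,Y}(w_i)}\,\alpha'$ for $z_j \ne v$ and $\Pr[1_{z_j} = 1] \le \frac{1}{q - c_{X,Y}(w_i)}\,\alpha'_v$ if $z_j = v$. Also, since it depends only on the colouring of $V_2$, we have $d(Z_l, Z_{l+1}) = d(W_{i-1}, W_i)$. So

\[ \mathrm{E}\Bigl[\sum_{j=1}^{l+1} d(Z_{j-1}, Z_j)\Bigr] \le \frac{1}{q - c_{X,Y}(w_i)}\bigl((\Delta-1)\alpha' + \alpha'_v\bigr)\Bigl(\bigl((\Delta-1)\alpha' + \alpha'_v\bigr) + 1\Bigr). \]

Finally note that $W_k$ and $W_{k+1}$ differ only in $V_1$, so after recolouring $V_1$ they have coupled. Hence

\[
\begin{aligned}
\mathrm{E}[d(X'', Y'')] &\le \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{l+1} \mathrm{E}[d(Z_{j-1}, Z_j)] && (18) \\
&\le \frac{1}{2}\sum_{i=1}^{k} \frac{1}{q - c_{X,Y}(w_i)}\bigl((\Delta-1)\alpha' + \alpha'_v\bigr)\Bigl(\bigl((\Delta-1)\alpha' + \alpha'_v\bigr) + 1\Bigr) && (19) \\
&= d(X, Y)\,\frac{\bigl((\Delta-1)\alpha' + \alpha'_v\bigr)\Bigl(\bigl((\Delta-1)\alpha' + \alpha'_v\bigr) + 1\Bigr)}{2}. && (20)
\end{aligned}
\]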

This gives contraction as long as $((\Delta - 1)\alpha' + \alpha'_v)$ is less than 1. For large $\Delta$, we see that $\alpha'$ and $\alpha'_v$ both approach $\frac{1}{q} e^{\Delta/q}$. Hence we have contraction when $\frac{\Delta}{q} e^{\Delta/q} < 1$. For small values of $\Delta$ it is possible to compute the smallest integral value of $q$ for which there is contraction. These values are shown in Table 1. When there is contraction, standard path coupling arguments give the mixing time bounds claimed.
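The smallest $q$ giving contraction can be found by a direct search over the condition $(\Delta - 1)\alpha' + \alpha'_v < 1$. The sketch below uses $\alpha$ and $\alpha'$ as in Lemmas 6.2 and 6.3, and takes $\alpha_v$, $\alpha'_v$ to be the analogous quantities with $q - 1$ in place of $q - 2$ and no excluded colour $q_0$. The exact form of the correction term in $\alpha'_v$ is an assumption of this example (obtained by repeating the variance argument of Lemma 6.3), and the function names are ours.

```python
def contraction_factor(q, delta):
    """(Delta - 1) * alpha' + alpha'_v, which must be below 1 for contraction."""
    base = 1 - 1 / (q - delta)
    a = 1 + (q - 2) * base ** ((delta - 1) * (q - delta) / (q - 2))
    a_dash = (1 + (q - a - 1) * (a - 1) / ((q - delta) * (q - 2) * a)) / a
    a_v = (q - 1) * base ** ((delta - 1) * (q - delta) / (q - 1))
    # Assumed numerator (q - a_v - 1) * a_v: the variance bound with q - 1 colours.
    a_v_dash = (1 + (q - a_v - 1) * a_v / ((q - delta) * (q - 1) * a_v)) / a_v
    return (delta - 1) * a_dash + a_v_dash

def min_q(delta):
    """Smallest q >= delta + 2 for which the contraction condition holds."""
    q = delta + 2          # q - delta >= 2 keeps alpha_v strictly positive
    while contraction_factor(q, delta) >= 1:
        q += 1
    return q
```

Under these assumptions the search gives 17 for $\Delta = 9$ and 26 for $\Delta = 14$, in agreement with the corresponding entries of Table 1.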

We now argue that Scan mixes as rapidly as Multicolour. The Markov chain Scan recolours the two sides of the bipartition in order, $(V_1, V_2), (V_1, V_2), \ldots$. The Markov chain Multicolour recolours a random side first in each step. However, recolouring the same side twice in a row has exactly the same effect as recolouring it once, since vertices in the same side of the bipartition are independent. The recolouring given by a run of Multicolour with order $(V_1, V_2), (V_2, V_1), (V_1, V_2)$ has exactly the same result as if the reversed pair was omitted. Hence any randomly chosen sequence can be replaced with a purely alternating sequence. Should the purely alternating sequence corresponding to the random choices of Multicolour start with $V_2$ or finish with $V_1$, we can augment the sequence with a recolouring of $V_1$ at the beginning or $V_2$ at the end respectively. The result follows, since the former is equivalent to taking a different starting position in Multicolour, and the latter cannot increase the total variation distance from stationarity. □

$\Delta$        $q$      $\lceil 11\Delta/6 \rceil$      $q/\Delta$

    9         17         17        1.89
   10         19         19        1.90
   11         21         21        1.91
   12         23         22        1.92
   13         25         24        1.92
   14         26         26        1.86
   15         28         28        1.87
   16         30         30        1.88
   17         32         32        1.88
   18         33         33        1.83
   19         35         35        1.84
   20         37         37        1.85
   21         39         39        1.86
   22         40         41        1.82
   23         42         43        1.83
   24         44         44        1.83
   25         46         46        1.84
   26         48         48        1.85
   27         49         50        1.81
   28         51         52        1.82
   29         53         54        1.83
   30         55         55        1.83
   31         56         57        1.81
   32         58         59        1.81
   33         60         61        1.82
   34         61         63        1.79
   35         63         65        1.80
   36         65         66        1.81
   37         67         68        1.81
   38         68         70        1.79
   39         70         72        1.79
   40         72         74        1.80
   41         74         76        1.80
   42         75         77        1.79
   43         77         79        1.79
   44         79         81        1.80
   45         81         83        1.80
   46         83         85        1.80
   47         84         87        1.79
   48         86         88        1.79
   49         88         90        1.80
   50         90         92        1.80
10000      17634      18334        1.76

Table 1: Minimum values of $q$ for contraction.
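The reduction from a random Multicolour order to an alternating Scan order, used at the end of the proof of Theorem 6.1, can be sketched as a simple list manipulation. Sides are encoded as the integers 1 and 2, and each Multicolour step as the pair of sides in the order they are recoloured; this encoding and the function names are assumptions of this example.

```python
from itertools import groupby

def collapse(steps):
    """Flatten Multicolour steps such as [(1, 2), (2, 1)] and merge repeated sides."""
    flat = [side for step in steps for side in step]
    # Recolouring the same side twice in a row acts like recolouring it once.
    return [side for side, _ in groupby(flat)]

def to_scan_order(steps):
    """Pad the collapsed sequence so that it starts with V1 and ends with V2."""
    seq = collapse(steps)
    if seq[0] == 2:
        seq.insert(0, 1)   # extra recolouring of V1 at the beginning
    if seq[-1] == 1:
        seq.append(2)      # extra recolouring of V2 at the end
    return seq
```

For the run $(V_1, V_2), (V_2, V_1), (V_1, V_2)$ discussed in the proof, `to_scan_order([(1, 2), (2, 1), (1, 2)])` yields the alternating sequence `[1, 2, 1, 2]`, that is, two steps of Scan.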

Remark 6.4. Our analysis shows that one-step analysis of a single-site chain on graph colourings need not break down at $q = 2\Delta$ [16, 21]. This apparent "boundary" seems merely to be an artefact of using Hamming distance.

Remark 6.5. Our scan chain can be used to prove polynomial mixing time for the Glauber dynamics (with the same values of $q$ and $\Delta$) by comparison techniques [?]. However, the proof is not completely straightforward and will appear elsewhere.

Remark 6.6. We note that many of the infinite graphs studied in statistical physics are bipartite, for example cubic grids and trees. Therefore our results imply, for example, absence of phase transition in the antiferromagnetic Potts model in the cubic grid with $q$ colours and dimension $d = \Delta/2$. A proof follows the lines of that given by Vigoda [22, §5] with obvious modifications. Since results with similar $q$, $d$ have been proved by different arguments in [13], we omit the details.